In the evaluation of Project Information Packages (PIPs), a content
analysis was performed to detect congruence between items in a
norm-referenced test and the content in six exemplary compensatory
education program curricula. Gains on congruent items were used to
assess the effectiveness of the programs. Preliminary results show
that the amount of congruence was too small to make strong
inferences, but that gains on congruent items were slightly higher in
well-implemented programs. The procedures can be easily replicated
for evaluations which require that gains on norm-referenced tests be
the major criteria for success. (Author)
ABSTRACT

In the evaluation of Project Information Packages (PIPs) the content
validity of the Metropolitan Achievement Test was examined by
searching for congruence among test items, the content in the project
curricula, and the curricula taught to the pupils. Patterns of
achievement were used as descriptive indices of program
effectiveness. A model analysis design is presented for replication
in other evaluations which require gains on norm-referenced tests to
be the major criteria for program success.


BACKGROUND


A methodology was developed to establish congruence between the
knowledge and skills tested by a norm-referenced achievement test
(MAT '70) and the content in the curricula specified in six exemplary
compensatory education programs. While it was pursued within the
evaluation of the field test of Project Information Packages (PIPs),
the methodology bears directly upon a familiar dilemma in evaluation
of field studies.

In many specially funded programs an evaluator is obliged to show
gains on a norm-referenced achievement test as evidence of success.
Gains can only be expected if the content in the test matches the
curriculum taught in the program. This expectation follows from a
fundamental assumption in constructing an achievement test, namely
content validity.* The content, in toto, while representative of a
spectrum of achievement in classrooms across the U.S., may not be
valid for pupils in a special program. Instead, only a subset of the
test items might actually have been taught in a particular program.
Logically, then, gains on the total score do not represent a
reasonable measure of the particular program.

In the first phase of this evaluation few gains were found at any
grade for the six programs (Stearns). One possible explanation for
this failure could be the lack of content validity of the
norm-referenced test.* A search for congruence would test this
possibility.

A methodology to establish congruence would require two procedures:
one to describe the content in the test and the other to determine
what content was taught. Few systematic studies of the congruence
between test items and curricula have been conducted. In an earlier
study congruence between a norm-referenced test (Coop. Prim.) and
state texts (Ca.) for first grade vocabulary was reported at only 55%
(by permission, Bianchini). For vocabulary the researcher could
search for exact matches; that strategy would not work for all the
content in this test battery, especially reading comprehension and
math problem-solving. In this study we also looked at the content
outlines provided by the publisher; they were not sufficiently
precise. It was necessary to formulate our own criteria and
descriptions of the content in each item.

*Since content validity may also be related to differences between
the norm group and the group that was tested, the avenue was explored
in the evaluation of PIPs, too.

When the new science curricula were introduced, teachers reviewed the
College Board achievement tests and marked the items covered in their
courses. Some part-whole scales were then provided to show congruence
for the new and traditional curricula (e.g., Malcolm and Watkins).
That method was rejected for our study, because we sought
"naturalistic" data, i.e., a source of data typically found in a
particular setting and unbiased by foreknowledge of the purpose of an
eventual collection. The project teachers were requested to turn in
their daily lesson plans and we did not explain the rationale for the
collection. These records were used to determine what was taught.
METHOD

Congruence: The following tasks were completed to establish
congruence: (1) rules were formulated to describe the knowledge and
skills in each item in each subtest; (2) the curricula in the
projects were searched for the presence of these rules and
appropriate content keyed to each item; (3) the curricula entered on
a daily record for each pupil were listed; and (4) when the curricula
taught to each pupil overlapped with the curricula keyed to a test
item, a match was declared.
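
The matching step in task (4) can be read as a set overlap. The sketch
below is only an illustration under that reading: the unit labels, the
item identifiers, and the function name congruent_items are
hypothetical and do not come from the study's files.

```python
from typing import Dict, Set


def congruent_items(taught_units: Set[str],
                    item_key: Dict[str, Set[str]]) -> Set[str]:
    """Return the test items for which a match would be declared.

    taught_units: curriculum units recorded as taught to one pupil.
    item_key: for each test item, the curriculum units keyed to it.
    A match is declared when the two sets share at least one unit.
    """
    return {item for item, keyed in item_key.items()
            if keyed & taught_units}


# Hypothetical data for one pupil (labels are invented for illustration).
item_key = {
    "math_07": {"workbook_A:p12-15"},
    "word_03": {"reader_B:unit2", "reader_C:unit5"},
    "read_11": {"reader_D:unit9"},
}
taught = {"workbook_A:p12-15", "reader_C:unit5"}
matches = congruent_items(taught, item_key)
```

With these invented records, two of the three items would count as
congruent for the pupil; the third item's keyed unit was never taught.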

Rules: The rules for each item were formulated to satisfy two
criteria. First, each item was viewed as a single, independent skill
and as many discrete features of an item as we could determine were
specified. Second, a strategy of "near transfer" was adopted. All of
the features had to be represented in the curricula exactly as they
were found in the test format. It seemed likely that the content in
the word knowledge and arithmetic computation items could be found
reproduced almost exactly in the curriculum, as they were in the
earlier study cited (Bianchini); the only variation might occur in
the actual values of the numbers in math computation.

The test publishers posed the item questions in the reading
comprehension test quite consistently, and rules were constructed for
matching those items with the content. Since the questions are
associated with a reading passage, rules for the passage as well as
rules for the kind of question the item posed were linked and ranges
for match were established. Existing quantitative rating systems,
such as the Dale-Chall readability formula and average sentence
length, were selected. Such measures are unquestionably rough indices
of comprehension levels, but they can be computed clearly.* Thus, the
rule for items in the reading comprehension test specified the type
of item (main idea, literal, inferential) and features of the
associated passage.
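
As an illustration of the kind of passage measure mentioned above, the
sketch below computes average sentence length and the raw score of the
published 1948 Dale-Chall formula. The paper does not say how its
inputs were tallied, so the tokenizing rules here are assumptions, and
the familiar_words argument is a stand-in for the Dale list of
familiar words, which is not reproduced.

```python
import re


def average_sentence_length(text: str) -> float:
    """Words per sentence, splitting sentences on ., ! and ? (assumed rule)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return len(words) / len(sentences)


def percent_unfamiliar(text: str, familiar_words: set) -> float:
    """Percentage of words not found in the supplied familiar-word list."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    unfamiliar = [w for w in words if w not in familiar_words]
    return 100.0 * len(unfamiliar) / len(words)


def dale_chall_raw_score(pct_unfamiliar: float,
                         avg_sentence_len: float) -> float:
    """Raw score of the 1948 Dale-Chall readability formula."""
    return 0.1579 * pct_unfamiliar + 0.0496 * avg_sentence_len + 3.6365


# Toy passage and a toy familiar-word list (both invented).
passage = "The cat sat. The dog ran fast."
familiar = {"the", "cat", "sat", "dog", "ran"}
asl = average_sentence_length(passage)
pct = percent_unfamiliar(passage, familiar)
score = dale_chall_raw_score(pct, asl)
```

A range for a match could then be stated on the score, for example by
requiring the curriculum passage to fall within a fixed distance of
the test passage; the width of such a range is not given in the paper.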

For a small percentage of the items no verifiable rule could be
constructed. The following rule illustrates the application of these
criteria and the type of description for one category of math story
problem.


ITEM: Which of these figures has the least number of line segments?

RULE: In order to respond correctly to this item a pupil must have
been taught

(a) recognition of planar shapes: [illegible], rectangle,
    [illegible], and hexagon

(a1) same scale

(b) concepts: counting, least number, line segments

(c) vocabulary (sight recognition and understanding): figures,
    least, number, line, segments


Pupil records: From the daily lesson plans of the teachers usable
data were extracted; incomplete, illegible entries were tabulated and
used only to keep track of the sample size of usable data relative to
the entire collection. The use of the remaining content was then
verified; in reading programs, for instance, in which aides read
aloud to pupils, it was ascertained that pupils also read silently to
themselves, as would be required in the test setting. Since
documents, like lesson plans, might not reveal the degree to which a
skill was taught, criteria were set to compensate for variations
among teachers as well as to standardize the impact of curricula
across test items. A particular word was considered taught, for
example, if the records showed a pupil was exposed twice to curricula
that contained the word in a well-marked exercise. Other heuristics
were developed for content in reading and arithmetic.

*In various studies the difficulty index derived from the Dale-Chall
formula has accounted for as much as 50% of the [illegible] variance.
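
The "exposed twice" criterion for vocabulary in the pupil records can
be sketched as a simple count. This is a hypothetical rendering: the
record layout (one list of words per recorded exercise) and the choice
to count a word once per exercise are assumptions, not details taken
from the study.

```python
from collections import Counter
from typing import Iterable, List, Set


def words_taught(exercises: Iterable[List[str]],
                 min_exposures: int = 2) -> Set[str]:
    """Return words appearing in at least min_exposures recorded exercises.

    Each inner list holds the words of one well-marked exercise from a
    pupil's record. A word repeated inside one exercise counts once
    (an assumption made for this sketch).
    """
    counts = Counter()
    for words in exercises:
        counts.update(set(words))
    return {word for word, n in counts.items() if n >= min_exposures}


# Invented record for one pupil: three exercises.
record = [["ship", "harbor"], ["ship", "anchor"], ["anchor", "anchor"]]
taught = words_taught(record)
```

Here "ship" and "anchor" each appear in two exercises and would be
treated as taught, while "harbor" appears in only one and would not.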

Much of the effort in the analysis of the curriculum was absorbed by
fragmenting the curricula into units that would match single items on
the test. Sometimes one booklet covered only one skill, while in
another series different skills appeared on the same page. In the
former, an entry was listed to mark the beginning and ending pages,
whereas, in the latter, each problem was noted. Thus, a file was
built, showing the recommended curricula taught to each pupil. This
file was merged with another file, showing which curricula were keyed
to the test items.

Analysis: For those pupils who took the same test in the fall and
spring, patterns of achievement (fail-pass, pass-pass, pass-fail,
fail-fail) were tallied to compare performance on congruent and
non-congruent items. A model factorial design was later devised to
incorporate the variables which appear to influence the patterns of
achievement.
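
The tally of fall-spring patterns described above can be sketched as
follows. The input layout, one tuple per pupil-item response with a
congruence flag, is an assumption made for illustration, and the
responses shown are invented rather than taken from the study.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Map (passed in fall, passed in spring) to the pattern names in the text.
PATTERNS = {
    (False, True): "fail-pass",
    (True, True): "pass-pass",
    (True, False): "pass-fail",
    (False, False): "fail-fail",
}


def tally_patterns(
    responses: Iterable[Tuple[bool, bool, bool]]
) -> Dict[bool, Counter]:
    """Count patterns separately for congruent and non-congruent items.

    Each response is (fall_correct, spring_correct, item_is_congruent).
    """
    tallies = {True: Counter(), False: Counter()}
    for fall, spring, congruent in responses:
        tallies[congruent][PATTERNS[(fall, spring)]] += 1
    return tallies


def proportions(counter: Counter) -> Dict[str, float]:
    """Share of each pattern within one subset of responses."""
    total = sum(counter.values())
    return {p: n / total for p, n in counter.items()} if total else {}


# Invented responses: two on congruent items, four on non-congruent items.
responses = [
    (False, True, True), (True, True, True),
    (False, False, False), (False, True, False),
    (True, False, False), (False, False, False),
]
tallies = tally_patterns(responses)
congruent_share = proportions(tallies[True])
```

Comparing the share of fail-pass responses between the two subsets is
one way to read "gains on congruent items" from such tallies.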


RESULTS


As of this writing, only preliminary results are available. The 
amount of congruence (item x content x pupil) appears to be very small, 
5-20%, and decreases with grade level, except in the subtest arithmetic 
computation. Some gains appear for that test.

The results suggest that the following factors could be included in a
model design: (1) Between-subjects: project, school quality, type of
materials, teacher-pupil arrangement, and grade; (2) Within-subjects:
subtest, congruence of item. The univariate dependent measure would
be the pattern of achievement. (Another variate, item difficulty,
could be added for multivariate analysis. Alternatively, achievement
patterns could be split so that entering performance (fall) could be
viewed as an independent variable.) Possible levels for these factors
will be

*Sample patterns will be available for display at the session.
EDUCATIONAL IMPLICATIONS


This method for establishing congruence confirms our expectation: on
some subtests pupils learned and, therefore, gained slightly on some
items which they were taught. These procedures can be easily
replicated for evaluation of similar programs, either national or
local, in which norm-referenced tests are required and used as the
main criteria for success.

Certainly, some educators and lay persons will not accept evidence of
success which rests upon a small percentage of test items, regardless
of the amount of gains within that percentage. When technical
constraints are satisfied and issues of use and interpretation
resolved, size itself becomes a philosophical preference. We can only
speculate about the causes for the lopsided congruence/non-congruence
in this project. Perhaps instructional guidance in the packages was
weak or the level of pupil performance and complementary curricula
lay far below the range on the test. Hopefully, replications of this
methodology would reveal differences clearly attributable to some
factors in the model design.

Further analyses to compare the test-retest reliability of
performance between congruent and non-congruent items would lend
stronger support to the methodology when the patterns of achievement
are different and congruence/non-congruence lopsided. At the same
time, this next step could address the question of success with
reference to growth nationally, the purpose, after all, of using a
norm-referenced test in an evaluation.


APPENDIX 


Table 1

THE NUMBER OF MAT ITEMS CONGRUENT WITH PIP-SPECIFIED
MATERIALS AT THE FOURTH AND EIGHTH GRADES

                 [illegible]      [illegible]       TOTAL MATHEMATICS

Fourth Grade     T = 50           T = 45            [illegible]
                 N = [illegible]  N = [illegible]   [illegible]
                 N = 56% of T     N = 37.7% of T    [illegible]

Eighth Grade     T = 50           [illegible]       [illegible]
                 N = 2            N = [illegible]   N = [illegible]
                 N = [illegible]  N = 6.6% of T     N = 14.2% of T

Key:
T = Total number of items in MAT subtest
N = Number of items known to be congruent
(no congruent response patterns)

Comprehension      Line 1:   45    133    81    27   [illegible]   286
                   Line 2:  .16    .47   .28   .09   .22   .20
                   Line 3:  [illegible]

(only 3 congruent response patterns)

Math Computation   [illegible]

Math Concepts      [illegible]

Pre-Post Response Patterns: FP (Fail/Pass); PP (Pass/Pass); FF (Fail/Fail); PF (Pass/Fail)
Line 1 = Number of specified response pattern for a subset (across all PIPs), e.g. number of FP for a good teacher/congruent items
Line 2 = Specified Response Pattern / Total Response Patterns for Subset
Line 3 = Rank order of response patterns within subset

POSSIBLE MODELS FOR ANALYSES

I. Program (child = unit of analysis)

   Type: Factorial ANOVA (all factors are categorical)

   Independent Variables
      Between-subjects
         Project
         School
         School quality
         [illegible]
         Grade
         Materials
      Within-subjects
         Subtest
         Congruence of item i (for pupil)
         Materials

   Dependent Variable
      Patterns of Fall-Spring Pass/Fail

II. Norm-Referenced (item = unit of analysis)

   Type: Multiple regression ([illegible])
   One equation for congruent and one equation for noncongruent

   Independent Variables
      X1  # children with fail-pass
      X2  # children with pass-pass
      X3  # children with fail-fail
      X4  # children with pass-fail

   Dependent Variable
      Yi  mean percent difficulty of item i